Mapping of Sequence Reads to the Reference Genomes    ◾    81

2.4.1.4  Extracting Alignments of a Chromosome

Sometimes we need to work with the alignments of a specific chromosome or a specific region of the genome. The following command extracts the alignments of chromosome 11 into a separate file using “samtools view”:

samtools view SRR769545_mem_sorted.bam NC_000011.10 > chr11_human.sam

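You can use any of the reference sequence names that appear in the RNAME field of the SAM/BAM file, so you may need to display the content of the file to check how the reference sequences/chromosomes are named. Note that extracting a chromosome or region in this way requires the BAM file to be sorted by coordinate and indexed (for example, with “samtools index”).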
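One way to check the names without scrolling through the alignments is to read them from the @SQ lines of the header, where each reference name is stored in the SN field. The following is a minimal sketch only: the helper name list_ref_names is hypothetical, and it assumes the header arrives as tab-separated text on standard input, for example from “samtools view -H”.

```shell
# Hypothetical helper: print the reference sequence names (SN fields)
# found in the @SQ header lines of SAM text read from standard input.
list_ref_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)   # drop the "SN:" prefix
    }'
}

# Typical use (assumes samtools is installed and the BAM file exists):
#   samtools view -H SRR769545_mem_sorted.bam | list_ref_names
```

Any name printed this way can be given to “samtools view” as the region argument.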

2.4.1.5  Filtering and Counting Alignment in SAM/BAM Files

To filter alignments in a SAM/BAM file, we can pipe the output of “samtools view” to “grep”, a Linux command that searches plain text for lines matching a regular expression. For instance, to find the alignments of chimeric reads, which carry the optional SAM/BAM tag “SA:”, we can use the following:

samtools view SRR769545_mem_sorted.bam | grep 'SA:' | less -S

A chimeric read is one that aligns to two distinct portions of the genome with little or no overlap.

To count the alignments of chimeric reads, we can use the “wc -l” command, which counts the lines it receives:

samtools view SRR769545_mem_sorted.bam | grep 'SA:' | wc -l
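Because “grep” matches the pattern anywhere on a line, a read name or quality string that happens to contain “SA:” would be counted as well. A stricter variant, sketched below, inspects only the optional fields (column 12 onward) for a tag that begins with “SA:Z:”. The function name count_sa_records is hypothetical, and the input is assumed to be headerless, tab-separated SAM text such as the output of “samtools view”.

```shell
# Hypothetical helper: count SAM records whose optional fields
# (column 12 onward) include an SA tag. Reads SAM text from standard input.
count_sa_records() {
    awk -F'\t' '{
        for (i = 12; i <= NF; i++)
            if ($i ~ /^SA:Z:/) { n++; break }   # count each record once
    }
    END { print n + 0 }'                        # print 0 when nothing matched
}

# Typical use (assumes samtools is installed and the BAM file exists):
#   samtools view SRR769545_mem_sorted.bam | count_sa_records
```

Either way, the result is a number of alignment records rather than a number of distinct reads, since each aligned segment of a chimeric read is reported as its own record.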

We can also use the “-c” option of “samtools view” to count the alignments in a BAM file instead of printing them:

samtools view -c SRR769545_mem_sorted.bam

We can use the values in the FLAG field of the SAM/BAM file to count the reads that match a specific FLAG value. For instance, since unmapped reads have the “0x4” bit set in the FLAG field, we can count all mapped reads by excluding the unmapped ones with the “-F” option:

samtools view -c -F 0x4 SRR769545_mem_sorted.bam

To count unmapped reads, use the “-f” option instead of the “-F” option:

samtools view -c -f 0x4 SRR769545_mem_sorted.bam
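The FLAG value is a sum of individual bits, and “-f” and “-F” test those bits rather than the whole number. As a rough illustration of that test, the sketch below checks a bit with shell arithmetic. The function name has_flag and the two example FLAG values are hypothetical and are not taken from the SRR769545 data.

```shell
# Hypothetical helper: succeed (exit status 0) if every bit of MASK ($2)
# is set in FLAG ($1), which is the condition "samtools view -f MASK" keeps.
has_flag() {
    [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

# 77 = 0x1 + 0x4 + 0x8 + 0x40: paired, read unmapped, mate unmapped, first in pair
has_flag 77 0x4 && echo "77: unmapped (kept by -f 0x4)"

# 99 = 0x1 + 0x2 + 0x20 + 0x40: paired, properly paired, mate reverse, first in pair
has_flag 99 0x4 || echo "99: mapped (kept by -F 0x4)"
```

A record with FLAG 77 would therefore be counted by the “-f 0x4” command, and a record with FLAG 99 by the “-F 0x4” command.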

We can also combine the “samtools view” command with other Unix/Linux commands through the pipe symbol “|” to perform more complex counts. For instance, we can count